NNPDF1.0 parton set for the LHC

M. Ubiali a * 

a Department of Physics, University of Edinburgh, Edinburgh, UK 

We present the first NNPDF full set of Parton Distribution Functions from a comprehensive DIS analysis. 
This approach, combining a Monte Carlo sampling of the probability measure in the space of PDFs with the 
use of neural networks as interpolating functions, provides a faithful and statistically sound determination of the 
uncertainty in parton distributions. The features of the fit and the results are discussed in detail, together with
some preliminary phenomenological analysis.



1. INTRODUCTION 

One of the key ingredients in the computation of
any observable involving hadrons is the set of
Parton Distribution Functions (PDFs); a faithful
estimate of their uncertainty is fundamental for
producing reliable phenomenological predictions
at hadron colliders. Especially now, with the
upcoming LHC data and increasingly small
experimental uncertainties, careful consideration
has to be given to theoretical uncertainties.

The traditional approach to PDF fitting [1,2,3]
suffers from some drawbacks which have not been
completely resolved. In particular, a benchmark
comparison performed between some of those sets
[4] shows that benchmark partons determined on
restricted data sets and global-fit partons do not
agree within errors. This makes the uncertainty
bands difficult to interpret in a statistical
sense. This and other difficulties have stimulated
various proposals for new approaches.

The NNPDF approach is one of them. It was
introduced in the context of the parametrisation
of DIS structure function data [5,6] and, after
having been successfully applied to the
determination of the non-singlet distribution [7],
it has been used for the construction of a full
parton set from DIS data [8]. The methodology and
the analysis are explained in detail in Ref. [8].
In the following section we briefly describe the
method and the features of the fit, concentrating
especially on the results.



2. THE NNPDF1.0 PARTON SET 

The determination of the NNPDF1.0 set is
based on a full set of deep-inelastic scattering
data, with various lepton beams and nucleon
targets, for a total of more than 3000 measurements
coming from 7 different experiments. The kinematic
coverage of the data sets included in the analysis
is shown in Fig. 1. The observables used in our fit
are either structure functions or reduced
cross-sections, including neutrino reduced
cross-sections.

Figure 1. Experimental data used in the analysis
after kinematical cuts.



In order to propagate the error from the
experimental data to the fit we build a sampling
of the probability distribution defined by the
experimental data. To do this, we generate
$N_\mathrm{rep}$ sets of artificial data
distributed according to a multi-Gaussian
distribution centered on the original data, with
variance determined by the experimental
uncertainties. The accuracy of the statistical
sampling is quantified by means of statistical
estimators, which indicate that a sample of
$\mathcal{O}(1000)$ replicas is enough to reproduce
the mean values, the variances and the correlations
of the experimental data to within 1% accuracy.
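
The replica generation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the analysis: the function name `generate_replicas`, the toy central values and the split into an uncorrelated statistical error plus a single fully correlated systematic are all hypothetical choices made for the example.

```python
import random
import statistics

def generate_replicas(central, stat, syst, n_rep, seed=0):
    """Draw n_rep artificial data sets around `central`.

    Each point gets an independent Gaussian shift of width stat[i] plus a
    shift syst[i] * r, where the single random number r is shared by all
    points of a replica (a fully correlated systematic).
    """
    rng = random.Random(seed)
    replicas = []
    for _ in range(n_rep):
        r_sys = rng.gauss(0.0, 1.0)  # common to all points of this replica
        replicas.append([
            c + rng.gauss(0.0, s) + r_sys * t
            for c, s, t in zip(central, stat, syst)
        ])
    return replicas

# Toy inputs (illustrative numbers, not real measurements).
central = [1.00, 0.80, 0.55]
stat = [0.05, 0.04, 0.03]
syst = [0.02, 0.02, 0.02]

replicas = generate_replicas(central, stat, syst, n_rep=1000)

# Compare replica averages and spreads with the inputs.
for i, c in enumerate(central):
    column = [rep[i] for rep in replicas]
    expected_sigma = (stat[i] ** 2 + syst[i] ** 2) ** 0.5
    print(f"point {i}: mean {statistics.fmean(column):.3f} (data {c:.3f}), "
          f"sigma {statistics.stdev(column):.3f} (expected {expected_sigma:.3f})")
```

With this construction the replicas follow a multi-Gaussian whose covariance is the sum of a diagonal statistical part and a rank-one correlated part, so the printed means and spreads should approach the input values as the number of replicas grows.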

For each replica we fit the artificial data by 
evolving the PDFs from the starting scale to the 
scale of the experimental points. Each of the 
five input PDFs is parameterized by a multi-layer 
neural network. The networks simply provide
a redundant and unbiased parameterization of
the PDFs at the initial scale. In principle, any
other redundant parametrisation with the same
features would be suitable.
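
To make the idea of a neural network as a redundant interpolating function concrete, the sketch below evaluates a small feed-forward network at a given momentum fraction x. The architecture (2-5-3-1), the choice of (x, ln x) as inputs, the sigmoid activation and the linear output are assumptions made for this illustration; the text above does not specify them, and the random weights stand in for fitted ones.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(layers, rng):
    """Random weights and thresholds for a feed-forward network.

    `layers` lists the number of nodes per layer, e.g. [2, 5, 3, 1].
    Each layer is stored as a list of (weights, threshold) pairs, one per node.
    """
    return [
        [([rng.uniform(-1.0, 1.0) for _ in range(n_in)], rng.uniform(-1.0, 1.0))
         for _ in range(n_out)]
        for n_in, n_out in zip(layers[:-1], layers[1:])
    ]

def evaluate(network, x):
    """Network output at momentum fraction x, with inputs (x, ln x)."""
    activations = [x, math.log(x)]
    last = len(network) - 1
    for depth, layer in enumerate(network):
        sums = [sum(w * a for w, a in zip(weights, activations)) - theta
                for weights, theta in layer]
        # Sigmoid on hidden layers, linear response on the output layer.
        activations = sums if depth == last else [sigmoid(s) for s in sums]
    return activations[0]

rng = random.Random(1)
net = init_network([2, 5, 3, 1], rng)
n_params = sum(len(w) + 1 for layer in net for w, _ in layer)
print("free parameters:", n_params)
print("f(0.1) =", evaluate(net, 0.1))
```

Even this small network has far more free parameters than a typical fixed functional form for a single PDF, which is what "redundant" means here: the shape is not constrained by the choice of parameterization.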

The determination of the best fit with a
redundant parameterization is a delicate issue,
because the parameterization might adapt not only
to the physical behaviour but also to statistical
fluctuations. The best fit is therefore given by
an optimal training, beyond which the figure of
merit improves only because the networks are
learning the statistical noise in the data. We
address this issue through a so-called
cross-validation method, based on a random division
of the data into a training set and a validation
set. The training set is the one on which we
actually minimize the figure of merit, which is a
function of the weights of the networks. The figure
of merit is evaluated not only on the training set
but also on the validation set at each iteration of
the minimisation, as a monitor. When the error
function of the training set is still decreasing
while that of the validation set starts increasing,
we have reached the optimal fit. It is extremely
important that the best fit is not determined as
the absolute minimum for a given functional form:
in this way inconsistent data or underestimated
uncertainties are automatically accounted for and
signalled by a larger-than-average value of the
$\chi^2$ per degree of freedom, and do not require
a separate treatment.
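
The stopping criterion can be illustrated with a short Python sketch. It is only a schematic version of the idea: the function `cross_validated_stop`, the `patience` parameter (how many non-improving iterations to tolerate before stopping) and the toy error curves are assumptions introduced for this example and are not taken from the analysis.

```python
def cross_validated_stop(history, patience=3):
    """Return (iteration, parameters) at the optimal stopping point.

    `history` yields (parameters, training_error, validation_error) once per
    minimisation iteration. The parameters with the lowest validation error
    are kept, and the scan stops once the validation error has failed to
    improve for `patience` consecutive iterations.
    """
    best = None  # (validation_error, iteration, parameters)
    since_best = 0
    for iteration, (params, e_train, e_valid) in enumerate(history):
        if best is None or e_valid < best[0]:
            best = (e_valid, iteration, params)
            since_best = 0
        else:
            since_best += 1
            if since_best >= patience:
                break
    if best is None:
        raise ValueError("empty training history")
    return best[1], best[2]

# Toy error curves: the training error keeps falling, while the validation
# error turns up once the fit starts following the noise.
def toy_history(n_iter=40):
    for t in range(n_iter):
        e_train = 1.0 + 2.0 / (1 + t)
        e_valid = 1.1 + 2.0 / (1 + t) + 0.002 * t * t
        yield f"weights@{t}", e_train, e_valid

stop_iter, stop_params = cross_validated_stop(toy_history())
print(stop_iter, stop_params)
```

The minimisation itself only ever sees the training set; the validation error is used purely as a monitor to decide which set of weights to keep.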

At the end of this process we end up with a
set of $N_\mathrm{rep}$ trained neural networks
which provides a representation of the probability
density. Fig. 2 shows that, even though individual
replicas may fluctuate significantly because of the
flexibility of the parameterization, average
quantities such as central values and error bands
are smooth and become stable as the number of
replicas increases. Any statistical property of the
parton distributions themselves, or of any function
of them, can therefore be calculated using standard
statistical methods. It is thus easy to compute any
desired property, such as correlations, or to
assess the stability of the fit under variation of
the number of parameters describing the basis PDFs.
It is also easy to restrict the fit to a subset of
the data and verify that, while the uncertainty
bands do increase in the regions where there are
fewer data, the central values remain compatible
within uncertainties [8].

Figure 2. Set of 25 replicas (top) and 100 replicas
(bottom) of the gluon distribution. The solid red
lines show the central value and the one-sigma
interval computed from each set.

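
As an illustration of how central values, one-sigma bands and correlations follow from a replica ensemble, the sketch below applies standard estimators to a toy set of 100 curves. The functional form `toy_gluon` and its fluctuating normalisation are purely hypothetical stand-ins for trained networks.

```python
import random
import statistics

def central_value(samples):
    return statistics.fmean(samples)

def one_sigma(samples):
    return statistics.stdev(samples)

def correlation(a, b):
    """Pearson correlation between two quantities over the replica set."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)
    return cov / (statistics.stdev(a) * statistics.stdev(b))

# Toy ensemble: each "replica" is a curve A * x**(-0.2) * (1 - x)**5 with a
# fluctuating normalisation A (purely illustrative, not a fitted PDF).
rng = random.Random(42)
norms = [rng.gauss(1.0, 0.1) for _ in range(100)]

def toy_gluon(norm, x):
    return norm * x ** -0.2 * (1.0 - x) ** 5

g_low = [toy_gluon(a, 0.01) for a in norms]   # replica values at x = 0.01
g_high = [toy_gluon(a, 0.3) for a in norms]   # replica values at x = 0.3

print("g(0.01) =", central_value(g_low), "+/-", one_sigma(g_low))
print("correlation:", correlation(g_low, g_high))
```

Because only the normalisation fluctuates in this toy ensemble, the two x points are fully correlated; with realistic replicas the same estimator would return a value between -1 and 1.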
With the method briefly described above we have
produced the NNPDF1.0 set of parton distributions.
In our analysis we fit five independent PDFs,
corresponding to the two lightest flavors, their
antiflavors, and the gluon. We assume the strange
distribution to be proportional to the light sea,
given that the data sets included in the analysis
place few constraints on the strange PDFs.
Furthermore, we determine all heavy quark PDFs from
perturbative evolution, generating them dynamically
according to the ZM-VFN scheme and therefore
neglecting intrinsic heavy flavor contributions.
Evolution is performed at next-to-leading order
from the initial scale $Q_0^2 = 2$ GeV$^2$ $= m_c^2$.
The quality of the central fit, obtained by
averaging over all PDFs in the sample, is measured
by its $\chi^2 = 1.34$.

Figure 3. The singlet and gluon PDFs at the initial
scale $Q_0^2 = 2$ GeV$^2$.

Figure 4. The valence, triplet and sea asymmetry
PDFs at the initial scale $Q_0^2 = 2$ GeV$^2$.



The five parton distributions which constitute
our basis set are displayed in Figs. 3 and 4 as
functions of x at the fixed initial scale. Our
results are in reasonable agreement with those of
the most recent NLO parton sets [1,2,3], especially
in the data region. Our PDF uncertainties tend to be
larger in the region where no data are available,
while in the data region they are generally
slightly larger, and in some cases comparable or
even smaller. Note that the uncertainty bands are
obtained without introducing any tolerance
criterion, which would correspond to an upward
rescaling of all experimental uncertainties.
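
The relation between the basis combinations shown in Figs. 3 and 4 and the individual light flavors can be sketched as follows. The definitions used here (singlet as the sum of all quarks and antiquarks, valence as the total quark minus antiquark content, triplet as (u + ubar) - (d + dbar), sea asymmetry as dbar - ubar) and the strange proportionality constant `c_s = 0.5` are assumptions made for this example; the text above only states that the strange distribution is proportional to the light sea. The gluon is the fifth fitted distribution and needs no transformation.

```python
def to_basis(u, ubar, d, dbar, c_s=0.5):
    """Map light-flavor PDF values at one x to the four quark combinations.

    c_s is the assumed proportionality constant between the strange
    distribution and the light sea: s = sbar = c_s * (ubar + dbar) / 2.
    """
    light_sea = ubar + dbar
    strange_total = c_s * light_sea  # s + sbar
    return {
        "singlet": u + ubar + d + dbar + strange_total,
        "valence": (u - ubar) + (d - dbar),
        "triplet": (u + ubar) - (d + dbar),
        "sea_asymmetry": dbar - ubar,
    }

def from_basis(basis, c_s=0.5):
    """Invert to_basis: recover the flavor values from the combinations."""
    light_sea = (basis["singlet"] - basis["valence"]) / (2.0 + c_s)
    u_minus_d = basis["triplet"] + basis["sea_asymmetry"]
    u_plus_d = basis["valence"] + light_sea
    return {
        "u": 0.5 * (u_plus_d + u_minus_d),
        "d": 0.5 * (u_plus_d - u_minus_d),
        "ubar": 0.5 * (light_sea - basis["sea_asymmetry"]),
        "dbar": 0.5 * (light_sea + basis["sea_asymmetry"]),
        "s": 0.5 * c_s * light_sea,
    }

# Toy flavor values at a single x (illustrative numbers only).
basis = to_basis(u=0.6, ubar=0.1, d=0.4, dbar=0.15)
flavors = from_basis(basis)
print(basis)
print(flavors)
```

The round trip shows that, once the strange distribution is tied to the light sea, four quark combinations plus the gluon carry the same information as the light flavors and antiflavors.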

In Fig. 5 the theoretical prediction obtained
using NLO QCD and the NNPDF1.0 set is compared to
the data for some representative deep-inelastic
observables included in the fit. The differences
between predictions obtained using different parton
sets are smaller for these observables than they
are for the parton distributions themselves, as
they should be, given that these data, with the
exception of the CHORUS data, have been used in the
determination of all the parton sets. In Fig. 6 we
show, as an illustrative example, the total cross
section for Z production at the LHC. All cross
sections have been computed at NLO using MCFM [9]
with a sample of $N_\mathrm{rep} = 100$ replicas,
which is fully adequate for this purpose. We find
good agreement of central values with the CTEQ6.1
computation, as expected, given that it uses a
ZM-VFN scheme for heavy quarks, as we do.

Figure 5. Comparison of NLO theoretical predictions
and data for two of the observables included in the
fit: $F_2$ at $Q^2 = 15$ GeV$^2$ (top) and the NC
reduced cross section at $Q^2 = 6.5$ GeV$^2$ and
$\sqrt{s} = 301$ GeV (bottom).
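
One way to see why a sample of 100 replicas can be adequate for an observable such as a total cross section is to look at the Monte Carlo precision of the ensemble estimators. The sketch below is a general statistical argument under a Gaussian assumption, not a reproduction of the MCFM computation: the per-replica values are toy numbers in arbitrary units, and `mc_summary` is a hypothetical helper.

```python
import math
import random
import statistics

def mc_summary(values):
    """Central value and PDF uncertainty of an observable from replicas,
    with the Monte Carlo precision of both (Gaussian approximation)."""
    n = len(values)
    sigma = statistics.stdev(values)
    return {
        "central": statistics.fmean(values),
        "pdf_uncertainty": sigma,
        "error_on_central": sigma / math.sqrt(n),
        "error_on_uncertainty": sigma / math.sqrt(2 * (n - 1)),
    }

# Toy per-replica results: one value of the observable per PDF replica
# (arbitrary units, illustrative numbers only).
rng = random.Random(7)
per_replica = [rng.gauss(1.0, 0.03) for _ in range(100)]

summary = mc_summary(per_replica)
for key, value in summary.items():
    print(f"{key}: {value:.4f}")
```

With 100 replicas the central value is determined to about one tenth of the PDF uncertainty, and the uncertainty itself to roughly 7%, which is usually sufficient when the goal is an error band on a single observable.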

Figure 6. Comparison of NLO theoretical predictions
for the Z-production total cross section at the
LHC.

The NNPDF1.0 set is the first full parton set based
on this new approach, and it is available through
the LHAPDF interface [10]. It can be further
improved in many respects. First of all, a wider
set of data beyond DIS should be included, and
heavy quark thresholds should be treated more
accurately. In addition, a set of NNLO parton
distributions should be produced, both for the
purpose of estimating uncertainties on the NLO
results and for some precision applications; large
and small x resummation corrections should also be
considered.